Member

@shnizzedy shnizzedy commented Jan 8, 2021 •

Summary

Fixes #2982. Maybe fixes #3527.

~~All tests pass locally. ⁸/₁₃ jobs pass on Travis. The Travis failures seem unrelated to the changes in this PR.~~
After rebase, all tests pass.

List of changes proposed in this PR (pull-request)

Updates

nipype/nipype/utils/draw_gantt_chart.py

Line 391 in fa65caf

def generate_gantt_chart(

to handle changes to profiler
- filter log file to include only logged nodes with timing information
- convert datetime strings to datetime objects before doing datetime math
- for nodes with timing information but no name, use id or an empty string for name instead of crashing
- warn if a node is suspected of being included twice instead of raising an exception
- skip nodes with non-timestamp start and finish values like "N/A" and "Unknown"
  
  nipype/nipype/utils/draw_gantt_chart.py
  Lines 144 to 147 in a08ee57

      try:
          all_res += float(event[resource])
      except ValueError:
          next

  nipype/nipype/utils/draw_gantt_chart.py
  Lines 151 to 154 in a08ee57

      try:
          all_res -= float(event[resource])
      except ValueError:
          next
Adds a test

nipype/nipype/pipeline/plugins/tests/test_callback.py

Lines 66 to 98 in fa65caf

@pytest.mark.parametrize("plugin", ["Linear", "MultiProc", "LegacyMultiProc"])
def test_callback_gantt(tmpdir, plugin):
    import logging
    import logging.handlers
    from os import path

    from nipype.utils.profiler import log_nodes_cb
    from nipype.utils.draw_gantt_chart import generate_gantt_chart

    log_filename = path.join(tmpdir, "callback.log")
    logger = logging.getLogger("callback")
    logger.setLevel(logging.DEBUG)
    handler = logging.FileHandler(log_filename)
    logger.addHandler(handler)

    # create workflow
    wf = pe.Workflow(name="test", base_dir=tmpdir.strpath)
    f_node = pe.Node(
        niu.Function(function=func, input_names=[], output_names=[]), name="f_node"
    )
    wf.add_nodes([f_node])
    wf.config["execution"] = {"crashdump_dir": wf.base_dir, "poll_sleep_duration": 2}

    plugin_args = {"status_callback": log_nodes_cb}
    if plugin != "Linear":
        plugin_args["n_procs"] = 8
    wf.run(plugin=plugin, plugin_args=plugin_args)

    generate_gantt_chart(
        path.join(tmpdir, "callback.log"), 1 if plugin == "Linear" else 8
    )
    assert path.exists(path.join(tmpdir, "callback.log.html"))

to make sure the Gantt chart HTML page generates without error
Adds myself as a contributor to .zenodo.json
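
The log-handling changes listed above can be sketched roughly as follows. This is a minimal illustration, not the code in draw_gantt_chart.py: it assumes the callback log holds one JSON object per line with ISO 8601 `start` and `finish` strings, and `parse_timestamp` and `load_timed_nodes` are hypothetical helper names.

```python
import json
from datetime import datetime


def parse_timestamp(value):
    """Return a datetime, or None when the value is not a usable timestamp.

    Assumption: timestamps are ISO 8601 strings. Placeholders such as
    "N/A" or "Unknown" (and missing values) yield None.
    """
    if isinstance(value, datetime):
        return value
    try:
        return datetime.fromisoformat(value)
    except (TypeError, ValueError):
        return None


def load_timed_nodes(lines):
    """Keep only log entries that carry parseable start and finish times."""
    nodes = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # not a node record
        if not isinstance(entry, dict):
            continue
        start = parse_timestamp(entry.get("start"))
        finish = parse_timestamp(entry.get("finish"))
        if start is None or finish is None:
            continue  # no timing information: leave it out of the chart
        entry["start"], entry["finish"] = start, finish
        # Fall back on "id", then an empty string, when "name" is absent.
        entry["name"] = entry.get("name") or entry.get("id") or ""
        nodes.append(entry)
    return nodes
```

Converting to datetime objects up front means later arithmetic such as `finish - start` yields a `timedelta` instead of failing on strings.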

Acknowledgment

(Mandatory) I acknowledge that this contribution will be available under the Apache 2 license.

callback.log.html screenshot

Member Author

shnizzedy commented Jan 8, 2021

As noted

[T]here is an issue with the number of threads being estimated by the callback, or the gantt chart creation script is pulling in the wrong numbers. Some of the nodes are reporting using 210 threads!

Originally posted by @ccraddock in FCP-INDI/C-PAC#1404 (comment)

I thought maybe runtime_threads was counting something different than I expected.

I see the profiler uses cpu_percent for runtime_threads, which returns a percentage of a CPU, so I think something like math.ceil(cpu_percent / 100) would be an estimate of the number of threads, but there's some disconnected code that looks like it collects the actual number of threads used (as opposed to percentage of 1 CPU).

Originally posted by @shnizzedy in FCP-INDI/C-PAC#1404 (comment)

I think estimating the number of threads (by dividing cpu_percent by 100 and rounding up) is good enough for what I'm trying to do.

Originally posted by @shnizzedy in FCP-INDI/C-PAC#1404 (comment)

I think the issues of

- what runtime_threads is logging and
- whether the number of threads used by a node is recorded

are related to this PR and issue, but beyond the scope of these changes. C-PAC has its own callback function in which I'm dividing and rounding, so I made no changes regarding runtime_threads in Nipype.
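
The divide-and-round-up workaround described above might look something like the sketch below. It is not C-PAC's or Nipype's actual callback code; `estimate_threads` is a hypothetical name, and the only assumption is that the logged value is either a number (a CPU percentage) or a placeholder such as "N/A".

```python
import math


def estimate_threads(cpu_percent):
    """Estimate a thread count from a per-process CPU percentage.

    Non-numeric placeholders (for example "N/A") are passed through
    unchanged so callers can still tell that no measurement was taken.
    """
    if not isinstance(cpu_percent, (int, float)):
        return cpu_percent
    return math.ceil(cpu_percent / 100)
```

Under this scheme a logged value of 210.0 would be charted as 3 threads rather than 210.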

@shnizzedy shnizzedy mentioned this pull request

Jan 8, 2021

generate_gantt_chart fails on logfile #2982

Closed

@codecov

codecov bot commented Jan 8, 2021 •

Codecov Report

Attention: Patch coverage is 95.38462% with 3 lines in your changes missing coverage. Please review.

Project coverage is 73.15%. Comparing base (5dc8701) to head (7223914).
Report is 50 commits behind head on master.

Files with missing lines	Patch %	Lines
nipype/utils/draw_gantt_chart.py	90.32%	3 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #3290      +/-   ##
==========================================
+ Coverage   72.86%   73.15%   +0.28%
==========================================
  Files        1278     1278
  Lines       59305    59356      +51
==========================================
+ Hits        43212    43419     +207
+ Misses      16093    15937     -156


@shnizzedy shnizzedy mentioned this pull request

Jan 29, 2021

⚡️ Update memory and threading estimates FCP-INDI/C-PAC#1428

Merged

8 tasks

@shnizzedy shnizzedy mentioned this pull request

Mar 4, 2021

🔇 Comment out runtime_threads ⩼ threads FCP-INDI/C-PAC#1457

Merged

8 tasks

effigies

effigies approved these changes

Mar 31, 2021

Member

@effigies effigies left a comment

Looks reasonable, though I don't have any experience with this bit of the code. Inclined to merge tomorrow unless someone complains.

Member Author

shnizzedy commented Apr 1, 2021

My only hesitance is the potentially misleading runtime_threads ― maybe that should be fixed before restoring this functionality?

mgxd

mgxd reviewed

Apr 1, 2021

Member

@mgxd mgxd left a comment

Looks good, just some minor nits.

My only hesitance is the potentially misleading runtime_threads ― maybe that should be fixed before restoring this functionality?

I agree 👍

nipype/utils/draw_gantt_chart.py (outdated, resolved)

nipype/pipeline/plugins/tests/test_callback.py (outdated, resolved)

nipype/utils/draw_gantt_chart.py (outdated, resolved)

shnizzedy added a commit to shnizzedy/nipype that referenced this pull request

Apr 1, 2021


 🎨 next ≠ continue

25dd1fc

Ref nipy#3290 (comment), nipy#3290 (comment)
Co-authored-by: Mathias Goncalves <goncalves.mathias@gmail.com>
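
For context on the commit title: in Python a bare `next` is only an expression statement that looks up the builtin and discards it, so inside an `except` block it acts like `pass` rather than skipping to the next iteration. A small sketch of the difference, using hypothetical function names and made-up input values:

```python
def count_processed_with_next(values):
    """Count loop iterations that reach the end of the body using bare `next`."""
    processed = 0
    for value in values:
        try:
            float(value)
        except ValueError:
            next  # no-op: names the builtin, does not skip the iteration
        processed += 1
    return processed


def count_processed_with_continue(values):
    """Same loop, but `continue` really does skip the rest of the body."""
    processed = 0
    for value in values:
        try:
            float(value)
        except ValueError:
            continue
        processed += 1
    return processed
```

With an input such as `["1.5", "N/A", "2"]`, the first function counts every item while the second leaves out the unparseable one.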

effigies

effigies reviewed

Apr 5, 2021

nipype/pipeline/plugins/tests/test_callback.py (resolved)

Member

effigies commented Apr 30, 2021

My only hesitance is the potentially misleading runtime_threads ― maybe that should be fixed before restoring this functionality?

I agree

Was this fixed? What needs doing?

nipype/nipype/utils/profiler.py

Member Author

shnizzedy commented May 3, 2021

Was this fixed? What needs doing?

I haven't fixed it (yet at least). The issue is that the chart uses runtime_threads from the callback log as a count of threads observed being used at runtime, but the value actually stored there is cpu_percent,

Line 143 in e9217c2

"runtime_threads": getattr(node.result.runtime, "cpu_percent", "N/A"),

a float representing the current process CPU utilization as a percentage

This leads to thread counts in the hundreds when they're expected to be in the ones, like this: [image: Gantt chart screenshot with CPU percent in "Threads"]

So I think the "threads" part of these charts should be changed before the chart functionality is restored, either

- by updating the log to include an integer count of threads and using this value in the chart
- by changing the column from threads to CPU percentage
- something else?

Member

effigies commented May 6, 2021

Yeah, seems like we want something like:

if status_dict['runtime_threads'] != "N/A":
    status_dict['runtime_threads'] //= 100

nipype/nipype/interfaces/base/tests/test_resource_monitor.py

Member Author

shnizzedy commented May 6, 2021

An existing unit test does

Lines 76 to 78 in 6c06030

assert (
    int(result.runtime.cpu_percent / 100 + 0.2) == n_procs
), "wrong number of threads estimated"

which is similar to what we're doing for now in C-PAC:

if runtime_threads != 'N/A':
    runtime_threads = math.ceil(runtime_threads / 100)

My concern is that, as I read

Note: the returned value can be > 100.0 in case of a process running multiple threads on different CPU cores.
Note: the returned value is explicitly not split evenly between all available CPUs (differently from psutil.cpu_percent()). This means that a busy loop process running on a system with 2 logical CPUs will be reported as having 100% CPU utilization instead of 50%. This was done in order to be consistent with top UNIX utility and also to make it easier to identify processes hogging CPU resources independently from the number of CPUs. It must be noted that taskmgr.exe on Windows does not behave like this (it would report 50% usage instead). To emulate Windows taskmgr.exe behavior you can do: p.cpu_percent() / psutil.cpu_count().

― psutil documentation: Process.cpu_percent

this number can be a misleading estimate. For example, if a process is using 25% of each of 4 CPUs, I believe this would report 100%, which would reduce to 1 or 2 threads depending on whether and how we round. I'd be happy to learn that either I'm misunderstanding the number or that the number is good enough.
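
To make the rounding question concrete, here is a side-by-side sketch of the three conversions mentioned in this thread: floor division, rounding up, and the `+ 0.2` padding from the existing unit test. The function names are hypothetical and the sample percentages are illustrative values, not measurements.

```python
import math


def floor_estimate(cpu_percent):
    """Floor division, as in the `//= 100` suggestion."""
    return int(cpu_percent // 100)


def ceil_estimate(cpu_percent):
    """Divide by 100 and round up."""
    return math.ceil(cpu_percent / 100)


def padded_estimate(cpu_percent):
    """Truncate after adding 0.2, as the existing unit test does."""
    return int(cpu_percent / 100 + 0.2)


def compare(samples):
    """Return (cpu_percent, floor, ceil, padded) for each sample."""
    return [
        (s, floor_estimate(s), ceil_estimate(s), padded_estimate(s))
        for s in samples
    ]


if __name__ == "__main__":
    for row in compare([95.0, 100.0, 130.0, 390.0]):
        print("cpu_percent=%6.1f floor=%d ceil=%d padded=%d" % row)
```

The schemes disagree for values such as 95.0 and 130.0, and none of them can distinguish one fully busy thread from four threads at 25% each, since both would be logged as 100.0.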

@shnizzedy shnizzedy mentioned this pull request

Jan 31, 2023

BUG: Reading serialized event requires conversion of dates #3528

Merged

Member

effigies commented Nov 18, 2024

@shnizzedy Can you rebase/merge master to resolve conflicts? I think we let this go too long and should just merge and let people find bugs and fix them.

shnizzedy added a commit to shnizzedy/nipype that referenced this pull request


 🎨 next ≠ continue

fde3e74

Ref nipy#3290 (comment), nipy#3290 (comment)
Co-authored-by: Mathias Goncalves <goncalves.mathias@gmail.com>

@shnizzedy shnizzedy force-pushed the fix/gantt-chart branch from 933fad3 to 6490708 Compare

November 18, 2024 14:53

shnizzedy and others added 12 commits

November 18, 2024 10:03


 FIX: Convert timing values to datetimes from strings

621c894

* exclude nodes without timing information from Gantt chart
* fall back on "id" or empty string if no "name" in node


 REF: Reduce double logging from exception to warning

2cf2d37


 TST: Add test for draw_gantt_chart

2e50f46


 STY: Automatic linting by pre-commit


 TST: Use tmpdir for Gantt test

ea4def1


 REF: Don't restrict nan timestamps to predetermined options

169c09e


 STY: Simplify warning

9637b0f

Co-authored-by: Mathias Goncalves <goncalves.mathias@gmail.com>


 REF: Remove unnecessary import

f336c22

Co-authored-by: Mathias Goncalves <goncalves.mathias@gmail.com>


 FIX: next ≠ continue

d76af57

Ref nipy#3290 (comment), nipy#3290 (comment)
Co-authored-by: Mathias Goncalves <goncalves.mathias@gmail.com>

@shnizzedy @effigies


 TST: Skip test that requires pandas if pandas not installed

a80923f

Co-authored-by: Chris Markiewicz <effigies@gmail.com>

@effigies @shnizzedy


 TEST: Add pandas import check

9096a5b

@effigies @shnizzedy


 STY: black

b1690d5

effigies and others added 2 commits

November 18, 2024 10:06

@effigies @shnizzedy


 STY/TEST: black and skipif syntax

de6657e


 STY: Fix typo (co{^n}vert)

6830e3a

@shnizzedy shnizzedy force-pushed the fix/gantt-chart branch from 6490708 to 6830e3a Compare

November 18, 2024 15:51


 FIX: Don't try to strptime something that's already a datetime

376d6e2

@shnizzedy shnizzedy force-pushed the fix/gantt-chart branch 2 times, most recently from e644bdd to 1bce774 Compare

November 18, 2024 17:17

effigies

effigies reviewed

nipype/pipeline/plugins/tests/test_callback.py (outdated, resolved)


 TEST: Update Gantt chart tests for coverage

19a0355

@shnizzedy shnizzedy force-pushed the fix/gantt-chart branch from 1bce774 to 19a0355 Compare

November 18, 2024 17:26

@shnizzedy shnizzedy marked this pull request as draft

November 18, 2024 17:33


 Merge branch 'master' into fix/gantt-chart

19078a4

shnizzedy added a commit to shnizzedy/nipype that referenced this pull request


 REF: Require Pandas for tests

2b6b4f2

Ref nipy#3290 (comment)

shnizzedy added 3 commits

November 18, 2024 14:10


 REF: Require Pandas for tests

73f657f

Ref nipy#3290 (comment)


 REF: 3.9-friendly typing.Union

8329d08


 REF: Handle absence/presence of tzinfo

4c0835f

@shnizzedy shnizzedy force-pushed the fix/gantt-chart branch from b35aa95 to 4c0835f Compare

November 18, 2024 19:11


 FIX: Drop pandas ceiling

12b6e37

effigies

effigies approved these changes

Member

@effigies effigies left a comment

Passing tests! Small comments.

nipype/info.py (outdated, resolved)

nipype/pipeline/plugins/tests/test_callback.py (outdated, resolved)

nipype/pipeline/plugins/tests/test_callback.py (resolved)

shnizzedy and others added 2 commits

November 18, 2024 14:27

@shnizzedy @effigies


 REF: ≥ 1.5.0

a693e12

Co-authored-by: Chris Markiewicz <effigies@gmail.com>

@shnizzedy @effigies


 FIX: Too much indentation

Co-authored-by: Chris Markiewicz <effigies@gmail.com>

Member Author

shnizzedy commented Nov 18, 2024

Do we want to do this

if status_dict['runtime_threads'] != "N/A":
    status_dict['runtime_threads'] //= 100

in this PR or kick the can on it?

Member

effigies commented Nov 18, 2024

Let's kick the can. If you want to open another PR in 5 minutes, that's fine with me. Last time we had that question, it delayed things 3 years.

@effigies effigies marked this pull request as ready for review

November 18, 2024 19:38

@effigies effigies merged commit 2e36f69 into nipy:master

25 checks passed

@effigies effigies mentioned this pull request